Document-Level Machine Translation as a Re-translation Process
Abstract
Most current machine translation systems translate a document sentence by sentence, ignoring discourse information and producing incoherences in the final translation. In this paper we present several document-level post-processes that improve the coherence and consistency of translations. Incoherences are detected and new partial translations are proposed. The work focuses on two phenomena: words translated inconsistently throughout a text, and gender and number agreement among words. Because we target specific phenomena, automatic evaluation does not show significant variation in the translations; a manual evaluation, however, reveals improvements.
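The first phenomenon, words with inconsistent translations throughout a text, can be illustrated with a minimal sketch. It assumes the document is already available as word-aligned (source, target) pairs. The function names, the input format, and the majority-vote suggestion are hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict


def find_inconsistent_translations(aligned_sentences):
    """Return source words that received more than one distinct translation.

    aligned_sentences: list of sentences, each a list of
    (source_word, target_word) pairs. This input format is an assumption.
    """
    translations = defaultdict(Counter)
    for pairs in aligned_sentences:
        for src, tgt in pairs:
            translations[src.lower()][tgt.lower()] += 1
    # Keep only source words seen with at least two different translations.
    return {src: counts for src, counts in translations.items() if len(counts) > 1}


def suggest_majority(inconsistent):
    """Propose the most frequent translation for each inconsistent word.

    Majority voting is a simple illustrative heuristic, not the paper's method.
    """
    return {src: counts.most_common(1)[0][0] for src, counts in inconsistent.items()}


# Toy English-to-Spanish document with invented alignments.
document = [
    [("desk", "escritorio"), ("is", "es")],
    [("desk", "mesa"), ("is", "es")],
    [("desk", "escritorio")],
]
inconsistent = find_inconsistent_translations(document)
suggestions = suggest_majority(inconsistent)
```

In this toy document only "desk" is flagged, because "is" is always translated the same way. A re-translation step could then revisit the sentences where the minority translation occurs.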
Similar resources
A new model for Persian multi-part words edition based on statistical machine translation
Multi-part words in English are hyphenated, with the hyphen separating the parts. Persian also has multi-part words. In Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...
Applying Machine Translation to Two-Stage Cross-Language Information Retrieval
Cross-language information retrieval (CLIR), where queries and documents are in different languages, needs a translation of queries and/or documents, so as to standardize both of them into a common representation. For this purpose, the use of machine translation is an effective approach. However, computational cost is prohibitive in translating large-scale document collections. To resolve this pr...
Document-level Re-ranking with Soft Lexical and Semantic Features for Statistical Machine Translation
We introduce two document-level features to polish baseline sentence-level translations generated by a state-of-the-art statistical machine translation (SMT) system. One feature uses the word-embedding technique to model the relation between a sentence and its context on the target side; the other feature is a crisp document-level token-type ratio of target-side translations for source-side wor...
Translation Technology Tools and Professional Translators’ Attitudes toward Them
Today, technology is an integral part of professional translation, and it is generally assumed that translators’ attitudes toward translation technology tools influence their interaction with technology (Bundgaard, 2017). Therefore, the present two-phase study seeks to shed some light on what translation technology tools are and how professional translators feel toward them. The research method ...
Novel Document Level Features for Statistical Machine Translation
In this paper, we introduce document-level features that capture the information needed to help an MT system perform better word sense disambiguation in the translation process. We describe enhancements to a Maximum Entropy-based translation model, utilizing long-distance contextual features identified from the span of the entire document and from both source and target sides, to improve the likelihood...
Journal: Procesamiento del Lenguaje Natural
Volume: 53
Publication date: 2014